Population Growth & Active Transportation Needs Analysis

Introduction

The aim of this use case is to analyze population growth trends across different regions within the City of Melbourne and to identify where transportation services, across the various modes available in the city, may need to be enhanced. As population densities change, the provision of stations and stops needs to keep pace so that people can travel to and from each area. The overarching goal is to identify areas projected to experience significant population growth and to determine the corresponding requirements for expanded transportation infrastructure ahead of those demographic shifts.

User Story:

What this use case will teach you

Analyzing population growth and whether the number of transportation services is sufficient to support it.

Using the "City of Melbourne Population Forecasts by Small Area 2021-2041" dataset, we will look at the forecast population growth by location, year and value. This shows how much growth is forecast for a particular location in a particular year.

Using the "City Circle tram stops" dataset, we will look at the existing tram stops and where they are located. This gives an insight into the number of tram stops in each location. Plotting their geolocations on a map helps point out potential locations for future tram stops.

The "Bus stops" dataset gives the location of bus stops within the city of Melbourne. This data helps us understand how many bus stops exist in each location and whether more will be needed in the future as the population grows.

The "Metro Train Stations with accessibility information" dataset gives information about the metro stations in various locations across Melbourne, including their geolocations. Using this, we can estimate whether more metro stations will be needed in the future due to population growth.

The initial approach is to get basic information about each dataset and understand its features, then keep the features that are necessary and discard those that only add to the dimensionality of the data. Once the features are understood, the goal is to clean the data and handle null values with appropriate tools and techniques. After cleaning, an initial basic visualization helps us understand the overall distribution of the features and find any correlations among them.

Finally, we will use machine learning models to predict the number of transportation services that might be needed to support population growth in particular locations within the city. Potentially, Power BI can be used to convey these insights to stakeholders, business owners and policy makers, helping them understand the rise in population and the resulting need for public transportation services so that they can plan for such resources beforehand.

Collecting data from the source

17052
geography year gender age value
0 City of Melbourne 2022 Female Age 0-4 2212.0
1 City of Melbourne 2024 Female Age 0-4 2818.0
2 City of Melbourne 2029 Female Age 0-4 4310.0
3 City of Melbourne 2031 Female Age 0-4 4736.0
4 City of Melbourne 2032 Female Age 0-4 4931.0
28
geo_point_2d geo_shape name xorg stop_no mccid_str xsource xdate mccid_int
0 -37.81922319307822, 144.9614014008424 {"coordinates": [144.9614014008424, -37.819223... Market Street / Flinders Street GIS Team 3 NaN Mapbase 2011-10-18 3
1 -37.821539117626855, 144.95356912978238 {"coordinates": [144.95356912978238, -37.82153... Victoria Police Centre / Flinders Street GIS Team D6 NaN Mapbase 2011-10-18 6
2 -37.815426586135686, 144.94512063442602 {"coordinates": [144.94512063442602, -37.81542... Central Pier / Harbour Esplanade GIS Team D2 NaN Mapbase 2011-10-18 10
309
geo_point_2d geo_shape prop_id addresspt1 addressp_1 asset_clas asset_type objectid str_id addresspt asset_subt model_desc mcc_id roadseg_id descriptio model_no
0 -37.80384165792465, 144.93239283833262 {"coordinates": [144.93239283833262, -37.80384... 0 76.819824 357 Signage Sign - Public Transport 355 1235255 570648 NaN Sign - Public Transport 1 Panel 1235255 21673 Sign - Public Transport 1 Panel Bus Stop Type 13 P.16
1 -37.81548699581418, 144.9581794249902 {"coordinates": [144.9581794249902, -37.815486... 0 21.561304 83 Signage Sign - Public Transport 600 1231226 548056 NaN Sign - Public Transport 1 Panel 1231226 20184 Sign - Public Transport 1 Panel Bus Stop Type 8 P.16
2 -37.81353897396532, 144.95728334230756 {"coordinates": [144.95728334230756, -37.81353... 0 42.177187 207 Signage Sign - Public Transport 640 1237092 543382 NaN Sign - Public Transport 1 Panel 1237092 20186 Sign - Public Transport 1 Panel Bus Stop Type 8 P.16
219
geo_point_2d geo_shape he_loop lift pids station
0 -37.77839599999999, 145.031251 {"coordinates": [145.031251, -37.7783959999999... No No Dot Matrix Alphington
1 -37.86724899999996, 144.830604 {"coordinates": [144.830604, -37.8672489999999... No No LCD Altona
2 -37.761897999999974, 144.96056099999998 {"coordinates": [144.96056099999998, -37.76189... No No No Anstey

Visualizations

The following code blocks briefly visualize the information in our datasets. This is just an initial visualization to understand the overall patterns in the data.

Population dataset

pop.head(5)
geography year value
5620 City of Melbourne 2021 0.498634
5621 City of Melbourne 2025 0.580116
5622 City of Melbourne 2028 0.691288
5623 City of Melbourne 2029 0.723781
5624 City of Melbourne 2033 0.836179

The code below takes each location name from the population dataset and looks up its latitude and longitude. If a particular location is not found, the code outputs a "CITY NOT FOUND" message. Finally, a new dataframe is created to store the latitude and longitude alongside the corresponding location name.
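The lookup step can be sketched as follows. This is a minimal illustration rather than the notebook's own code: `geocode_areas` and the `lookup` callable are hypothetical names, and the query suffix is inferred from the "CITY NOT FOUND" messages in the output. In practice `lookup` could wrap a geocoding service; here a small in-memory dictionary stands in so the sketch runs offline.

```python
from typing import Callable, Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]


def geocode_areas(
    names: List[str],
    lookup: Callable[[str], Optional[Coordinates]],
    suffix: str = ", Victoria, Australia",  # assumed from the printed messages
) -> List[Dict[str, object]]:
    """Resolve each area name to (lat, lon) and report names that cannot be found."""
    rows = []
    for name in names:
        query = f"{name}{suffix}"
        coords = lookup(query)
        if coords is None:
            print(f"CITY NOT FOUND: {query}")
            continue
        lat, lon = coords
        rows.append({"geography": name, "city_lat": lat, "city_long": lon})
    return rows


# Stand-in gazetteer (hypothetical); coordinates copied from the notebook's output table.
KNOWN_PLACES = {
    "Carlton, Victoria, Australia": (-37.800423, 144.968434),
    "Docklands, Victoria, Australia": (-37.817542, 144.939492),
}

rows = geocode_areas(
    ["Carlton", "Docklands", "Melbourne (Remainder)"], KNOWN_PLACES.get
)
```

The returned list of dictionaries can be passed directly to a dataframe constructor to get the `geography`, `city_lat`, `city_long` layout.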


CITY NOT FOUND: Melbourne (Remainder), Victoria, Australia
CITY NOT FOUND: West Melbourne (Residential), Victoria, Australia
geography city_lat city_long
0 City of Melbourne -37.812382 144.948265
1 Carlton -37.800423 144.968434
2 Docklands -37.817542 144.939492
3 East Melbourne -37.812498 144.985885
4 Kensington -37.793938 144.930565
5 Melbourne -37.814171 144.965562
6 North Melbourne -37.807609 144.942351
7 Parkville -37.787115 144.951553
8 Port Melbourne -37.833361 144.921920
9 South Yarra -37.837770 144.991854
10 Southbank -37.825362 144.964020
11 Australia Post Sunshine West PDC -37.808367 144.813062

sns.move_legend(plot, "upper left", bbox_to_anchor=(1, 1))

Key findings from the plot:

This scatterplot visualizes population growth trends over time for various locations within Melbourne. Here's what can be understood from the plot:

  1. Dominant Growth in the City of Melbourne: The red line, representing the City of Melbourne, shows a strong upward trend, indicating a significant increase in population from 2020 to 2040. This area is experiencing more substantial growth compared to other locations.

  2. Stable Populations in Other Areas: The green line, representing Carlton, and other colored lines for various districts such as Docklands, East Melbourne, and others, show relatively stable populations over the same period. These lines are mostly flat, indicating little to no growth in population.

  3. Cluster of Locations with Minimal Change: Most of the other locations represented by various colors remain nearly constant over the years. This suggests that while the central area of Melbourne is growing, other districts are not experiencing similar population increases.

  4. Potential Focus for Infrastructure Development: Given the significant growth in the City of Melbourne, there might be an increased need for infrastructure and transportation services in this area to accommodate the rising population.

  5. Comparison of Growth Rates: The graph allows for an easy comparison of growth rates between the central business district and the outlying areas, highlighting areas that might require more attention from urban planners and policymakers.

Overall, this plot is crucial for planning and resource allocation, especially in targeting areas with rapid population growth for transportation and infrastructure development.

This bar chart visualizes the overall population distribution across various geographic areas within Melbourne. Here’s what can be understood from the plot:

  1. Dominant Population in the City of Melbourne: The bar for the City of Melbourne is significantly longer than for any other area, indicating that it has a much higher population compared to other parts of the city.

  2. Comparatively Lower Populations in Other Areas: Areas like Carlton, Docklands, and Kensington have relatively smaller populations, as shown by the shorter bars.

  3. Minimal Population in Some Suburbs: Areas like Southbank, West Melbourne (both Industrial and Residential), and Port Melbourne have even smaller populations relative to the central and other urban areas, indicated by the shortest bars.

  4. Potential Areas for Infrastructure Focus: Given the high population in the City of Melbourne, this area might require more robust transportation and infrastructure services. Conversely, areas with smaller populations might be evaluated for potential growth or revitalization projects.

  5. Strategic Planning for Service Allocation: Urban planners and decision-makers can use this data to strategically plan resource allocation, ensuring that areas with higher populations are adequately served, potentially planning for expansion in areas showing a potential for growth.

Overall, this plot serves as a crucial tool for understanding how population is spread across Melbourne, guiding decisions related to urban planning, transportation services, and infrastructure development.

This box plot visualizes the distribution of population values over years for various locations within Melbourne. Here's what the graph tells us:

  1. Variability and Range:
    • City of Melbourne: Shows a high median value close to 0.8 and a range extending from about 0.6 to 1.0, indicating a consistently high population that also varies more over the years than in any other area.
    • Carlton: Has a lower median population value, around 0.2, with very little variation.
    • Docklands: Similar to Carlton, shows minimal variation with a slightly lower median population value.
  2. Outliers and Consistency:
    • No outliers are visible for any location, suggesting that population values are consistent within each area across the observed years.
    • Most locations except for the City of Melbourne have very compressed boxes, indicating that population values are stable and don’t fluctuate much.
  3. Comparisons:
    • Melbourne (CBD) and City of Melbourne exhibit higher population values compared to the other regions, with Melbourne (CBD) having noticeable variability in its distribution.
    • Locations like Southbank, West Melbourne (Industrial), and West Melbourne (Residential) have very low population values, close to 0, and show minimal variation, indicating stable but small populations.
  4. Interpretation for Planning:
    • The high values and variability in the City of Melbourne and Melbourne (CBD) suggest these areas have not only high populations but also potentially dynamic changes in population density over time, likely needing more dynamic and robust urban planning and transportation solutions.
    • The stability in population values in areas like Carlton, Docklands, and others suggests that while population demands may be lower, planning can focus on maintaining or slowly developing existing infrastructure.

Overall, this box plot is particularly useful for understanding where population pressures are greatest and where they are most stable, assisting city planners and policymakers in making informed decisions about where to allocate resources and how to plan for future development or conservation.

Tram dataset

tram.head(3)
stop_name lat_tram lon_tram
0 Market Street / Flinders Street -37.819223 144.961401
1 Victoria Police Centre / Flinders Street -37.821539 144.953569
2 Central Pier / Harbour Esplanade -37.815427 144.945121


map

As the tram data is limited to the central city area, we will no longer use this dataset in this use case.

Metro station dataset






map

Bus stop dataset



As the data for tram stops is not sufficient, we will no longer use the tram data.


Now the data is stationary, as the p-value is below 0.05.

2042-01-01    0.001606
2043-01-01    0.001054
2044-01-01    0.001233
2045-01-01    0.001175
2046-01-01    0.001194
Freq: AS-JAN, Name: predicted_mean, dtype: float64

Key findings from the plot.

Code that goes through all cities


Key findings from the plot.

1. DBSCAN (Density Based Spatial Clustering)

General approach explained/shown below


Key findings from the plot.

The plot above shows the clusters of bus stops in different colours, first on a plain plot and then on a real-world map. This is helpful in understanding how the bus stops are distributed within the city and in identifying potential areas of interest for future development.
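The clustering step can be sketched with scikit-learn's `DBSCAN` using the haversine metric, which expects coordinates in radians. This is an illustrative sketch, not the notebook's code: `cluster_stops`, the 0.5 km radius and `min_samples=3` are assumptions chosen for the example, and the coordinates are made up. Points that do not belong to any dense group receive the label -1.

```python
import numpy as np
from sklearn.cluster import DBSCAN

EARTH_RADIUS_KM = 6371.0088


def cluster_stops(latlon_deg, eps_km=0.5, min_samples=3):
    """Cluster (lat, lon) pairs given in degrees; returns one label per point, -1 = noise."""
    coords = np.radians(np.asarray(latlon_deg, dtype=float))
    model = DBSCAN(
        eps=eps_km / EARTH_RADIUS_KM,  # convert the radius from km to radians
        min_samples=min_samples,
        metric="haversine",
        algorithm="ball_tree",
    )
    return model.fit_predict(coords)


# Illustrative coordinates: four stops close together and one isolated stop.
stops = [
    (-37.8136, 144.9631),
    (-37.8140, 144.9635),
    (-37.8132, 144.9628),
    (-37.8138, 144.9640),
    (-37.7000, 145.1000),
]
labels = cluster_stops(stops)
print(labels)
```

The neighbourhood radius and minimum cluster size control how strict the notion of "dense" is, so the set of stops labelled -1 depends directly on those two choices.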

2. Haversine Formula to find distance
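The haversine formula gives the great-circle distance between two points from their latitudes and longitudes. A minimal sketch of the formula and of building a pairwise distance matrix follows; the function names are illustrative rather than taken from the notebook, and the mean Earth radius of 6371.0088 km is an assumed constant.

```python
import math

EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_matrix(points):
    """Symmetric matrix of pairwise distances for a list of (lat, lon) tuples."""
    return [[haversine_km(*p, *q) for q in points] for p in points]


# Example input: the three tram stop coordinates shown earlier in the notebook.
points = [
    (-37.819223, 144.961401),
    (-37.821539, 144.953569),
    (-37.815427, 144.945121),
]
matrix = distance_matrix(points)
for row in matrix:
    print([round(d, 3) for d in row])
```

The diagonal of the resulting matrix is zero and the matrix is symmetric, which matches the structure of the distance tables in this section.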

underserved_distance_df
0 1 2 3 4 5 6 7 8 9 ... 37 38 39 40 41 42 43 44 45 46
0 0.000000 1.964545 2.605692 3.652704 6.895595 2.115379 1.873523 1.980960 3.647904 2.177803 ... 3.447816 3.699037 2.404622 1.953986 3.232718 3.349753 1.988170 3.642521 3.242434 2.121221
1 1.964545 0.000000 0.862403 5.357905 5.725009 0.523640 3.787898 0.032375 2.488723 0.619123 ... 5.122267 2.518697 0.805197 0.288873 2.119326 5.228514 0.400413 5.351613 2.119995 0.776542
2 2.605692 0.862403 0.000000 6.146014 6.084751 1.265593 4.476681 0.830551 1.709769 1.335583 ... 5.917246 1.730613 0.244921 0.695853 1.392195 5.945374 0.629202 6.138600 1.388683 0.559538
3 3.652704 5.357905 6.146014 0.000000 8.173400 5.251295 1.961359 5.384167 7.297522 5.262132 ... 0.258265 7.349403 5.978949 5.452847 6.883566 0.981970 5.517946 0.019532 6.893570 5.726557
4 6.895595 5.725009 6.084751 8.173400 0.000000 5.234256 7.869994 5.741710 7.606925 5.136704 ... 7.932268 7.603056 6.245052 5.992522 7.404850 8.725853 6.085721 8.181785 7.397035 6.410958
5 2.115379 0.523640 1.265593 5.251295 5.234256 0.000000 3.828361 0.548972 2.957687 0.098221 ... 5.006380 2.983183 1.275673 0.812217 2.605959 5.218645 0.922308 5.246818 2.605329 1.295189
6 1.873523 3.787898 4.476681 1.961359 7.869994 3.828361 0.000000 3.808405 5.434710 3.867979 ... 1.819885 5.488969 4.278065 3.812646 5.028177 1.481333 3.854608 1.946689 5.039425 3.993454
7 1.980960 0.032375 0.830551 5.384167 5.741710 0.548972 3.808405 0.000000 2.456910 0.643576 ... 5.148906 2.486769 0.773025 0.263506 2.088139 5.250875 0.373512 5.377809 2.088744 0.748495
8 3.647904 2.488723 1.709769 7.297522 7.606925 2.957687 5.434710 2.456910 0.000000 3.034830 ... 7.095550 0.058454 1.686901 2.232070 0.417102 6.905076 2.122394 7.286840 0.408922 1.771010
9 2.177803 0.619123 1.335583 5.262132 5.136704 0.098221 3.867979 0.643576 3.034830 0.000000 ... 5.015702 3.059245 1.357261 0.907073 2.687763 5.247484 1.016249 5.258009 2.686810 1.387397
10 3.349633 5.225195 5.944031 0.959205 8.708764 5.213046 1.482751 5.247659 6.908837 5.241482 ... 1.037900 6.963634 5.750991 5.270362 6.504754 0.023172 5.318233 0.939775 6.516252 5.470838
11 2.605422 1.490369 0.902812 6.257095 6.986316 1.998015 4.436285 1.460649 1.082806 2.086017 ... 6.048354 1.126662 0.756358 1.211003 0.672313 5.916120 1.097252 6.247153 0.677682 0.723378
12 1.888444 3.766008 4.242536 3.200500 8.673949 3.987805 1.377420 3.776194 4.776766 4.055777 ... 3.110217 4.834613 4.008911 3.685674 4.405896 2.441530 3.686926 3.183162 4.419577 3.693773
13 3.419497 5.095025 5.889580 0.280367 7.918997 4.980065 1.793195 5.121636 7.067226 4.989596 ... 0.028334 7.118497 5.726087 5.195340 6.652210 1.053551 5.262397 0.281360 6.661930 5.477971
14 6.398067 5.226534 5.599598 7.761954 0.502333 4.733933 7.400295 5.243554 7.140102 4.636282 ... 7.517611 7.137717 5.756028 5.495329 6.929357 8.287751 5.589363 7.769696 6.921852 5.917284
15 3.058493 4.882643 5.626934 0.787663 8.261705 4.842547 1.246706 4.906396 6.668694 4.866344 ... 0.753944 6.722282 5.442663 4.944404 6.259371 0.464208 4.998415 0.769543 6.270229 5.171987
16 3.382004 2.019774 1.201389 7.029773 7.075162 2.463999 5.220307 1.987475 0.537661 2.536320 ... 6.817676 0.545043 1.215758 1.782836 0.428401 6.700487 1.679198 7.020179 0.413865 1.360286
17 5.721497 5.161008 5.788275 6.206659 2.211076 4.638278 6.255699 5.187101 7.480752 4.545146 ... 5.981621 7.493931 5.871543 5.449881 7.180367 6.873996 5.560558 6.217458 7.176955 5.932460
18 1.988442 0.499152 0.617281 5.544727 6.182679 1.021811 3.859541 0.472855 2.042671 1.116009 ... 5.319092 2.077542 0.435431 0.210613 1.655828 5.329223 0.100071 5.536886 1.658054 0.283129
19 2.230139 1.001594 0.629495 5.861624 6.606517 1.518718 4.093437 0.973250 1.565506 1.610146 ... 5.645387 1.604885 0.396872 0.716591 1.165310 5.574140 0.602848 5.852554 1.169031 0.225495
20 2.727331 4.470104 4.823519 3.848951 9.588992 4.757246 2.236593 4.475347 5.058680 4.833652 ... 3.807178 5.117086 4.580733 4.345391 4.730818 2.961302 4.326824 3.830170 4.745720 4.264061
21 2.297553 0.825901 0.396501 5.893864 6.361458 1.324536 4.170486 0.794984 1.670286 1.411497 ... 5.671672 1.703151 0.152409 0.561796 1.293872 5.647148 0.453112 5.885542 1.294834 0.180337
22 2.273955 1.039889 0.625999 5.907232 6.627715 1.554886 4.135878 1.011105 1.518182 1.645575 ... 5.691338 1.557378 0.401759 0.756530 1.118826 5.616764 0.642651 5.898119 1.122398 0.266321
23 3.373368 2.019579 1.204055 7.021662 7.085735 2.465801 5.210515 1.987298 0.529282 2.538548 ... 6.809866 0.538203 1.215155 1.781241 0.412393 6.690564 1.677168 7.012034 0.397893 1.356150
24 3.061395 4.888211 5.631341 0.796153 8.275103 4.849555 1.246051 4.911903 6.669877 4.873607 ... 0.765930 6.723517 5.446656 4.949175 6.260739 0.450751 5.002902 0.777922 6.271624 5.175533
25 3.001721 4.727999 5.061577 4.000947 9.864996 5.023083 2.464372 4.732550 5.236883 5.100393 ... 3.973937 5.295146 4.817953 4.597239 4.919959 3.086350 4.575751 3.981883 4.935047 4.502087
26 3.971509 2.730833 1.914575 7.623043 7.668399 3.178099 5.766470 2.698610 0.335323 3.249989 ... 7.419246 0.278772 1.925743 2.487252 0.739931 7.238062 2.380927 7.612537 0.729594 2.044785
27 4.661525 4.548637 5.305764 4.770735 3.533046 4.055865 4.930908 4.578880 7.011088 3.976567 ... 4.547485 7.034669 5.331463 4.827759 6.661241 5.455998 4.941635 4.781711 6.660841 5.318092
28 4.110928 2.293540 1.508387 7.641870 6.214769 2.527046 5.983501 2.265456 1.655829 2.557760 ... 7.409529 1.624519 1.708278 2.192023 1.699551 7.453621 2.135908 7.634966 1.684790 2.007120
29 2.392312 1.227472 0.732337 6.037266 6.784782 1.740498 4.241953 1.198411 1.343230 1.830366 ... 5.824694 1.384565 0.540194 0.944975 0.938186 5.723246 0.831089 6.027759 0.942580 0.454964
30 2.491369 4.248072 4.619014 3.735797 9.352320 4.528191 2.054982 4.253926 4.907636 4.603812 ... 3.680274 4.966090 4.377157 4.128810 4.570437 2.877173 4.112878 3.717340 4.585149 4.060017
31 2.042856 3.848772 4.268870 3.490453 8.890213 4.105762 1.690192 3.856490 4.689088 4.178584 ... 3.408118 4.747350 4.030363 3.746226 4.331734 2.701038 3.738032 3.472806 4.345910 3.713213
32 2.200662 0.828237 0.508592 5.808248 6.426399 1.341205 4.072343 0.798967 1.697951 1.431598 ... 5.587817 1.733870 0.263684 0.548287 1.309058 5.550330 0.434730 5.799701 1.311367 0.081993
33 3.092739 1.739633 0.949892 6.737597 6.930871 2.198855 4.938042 1.707502 0.760664 2.274959 ... 6.524328 0.784365 0.934541 1.494681 0.473998 6.418989 1.389028 6.728146 0.466248 1.063513
34 4.062902 2.811675 1.989924 7.714464 7.707192 3.254656 5.857569 2.779407 0.424903 3.325493 ... 7.510628 0.369103 2.006955 2.570486 0.831336 7.328927 2.464785 7.703960 0.821018 2.131207
35 2.412964 0.824310 0.249837 5.990674 6.262406 1.295602 4.286475 0.792166 1.667277 1.377232 ... 5.766003 1.695807 0.019980 0.590839 1.310595 5.760684 0.495394 5.982682 1.309801 0.317819
36 2.005453 0.192016 0.997265 5.313177 5.544317 0.331624 3.794512 0.217929 2.658271 0.427329 ... 5.073914 2.686445 0.971371 0.480702 2.295840 5.218909 0.591421 5.307536 2.295969 0.966086
37 3.447816 5.122267 5.917246 0.258265 7.932268 5.006380 1.819885 5.148906 7.095550 5.015702 ... 0.000000 7.146819 5.753931 5.222953 6.680530 1.061034 5.290118 0.260593 6.690248 5.505982
38 3.699037 2.518697 1.730613 7.349403 7.603056 2.983183 5.488969 2.486769 0.058454 3.059245 ... 7.146819 0.000000 1.715309 2.264854 0.466515 6.959940 2.155878 7.338780 0.457382 1.807876
39 2.404622 0.805197 0.244921 5.978949 6.245052 1.275673 4.278065 0.773025 1.686901 1.357261 ... 5.753931 1.715309 0.000000 0.573756 1.330556 5.751738 0.479656 5.971005 1.329751 0.317377
40 1.953986 0.288873 0.695853 5.452847 5.992522 0.812217 3.812646 0.263506 2.232070 0.907073 ... 5.222953 2.264854 0.573756 0.000000 1.852364 5.272430 0.113885 5.445628 1.853911 0.491097
41 3.232718 2.119326 1.392195 6.883566 7.404850 2.605959 5.028177 2.088139 0.417102 2.687763 ... 6.680530 0.466515 1.330556 1.852364 0.000000 6.501348 1.740457 6.873003 0.015654 1.378238
42 3.349753 5.228514 5.945374 0.981970 8.725853 5.218645 1.481333 5.250875 6.905076 5.247484 ... 1.061034 6.959940 5.751738 5.272430 6.501348 0.000000 5.319849 0.962549 6.512884 5.470972
43 1.988170 0.400413 0.629202 5.517946 6.085721 0.922308 3.854608 0.373512 2.122394 1.016249 ... 5.290118 2.155878 0.479656 0.113885 1.740457 5.319849 0.000000 5.510418 1.742191 0.377366
44 3.642521 5.351613 6.138600 0.019532 8.181785 5.246818 1.946689 5.377809 7.286840 5.258009 ... 0.260593 7.338780 5.971005 5.445628 6.873003 0.962549 5.510418 0.000000 6.883035 5.718032
45 3.242434 2.119995 1.388683 6.893570 7.397035 2.605329 5.039425 2.088744 0.408922 2.686810 ... 6.690248 0.457382 1.329751 1.853911 0.015654 6.512884 1.742191 6.883035 0.000000 1.381048
46 2.121221 0.776542 0.559538 5.726557 6.410958 1.295189 3.993454 0.748495 1.771010 1.387397 ... 5.505982 1.807876 0.317377 0.491097 1.378238 5.470972 0.377366 5.718032 1.381048 0.000000

47 rows × 47 columns

This heatmap displays a distance matrix for underserved bus stops, where each cell represents the distance in kilometers between a pair of bus stops, indexed from 0 to 46. The colors indicate the magnitude of the distance, with darker colors (closer to blue and purple) indicating shorter distances and lighter colors (towards yellow) indicating greater distances. Here's what the heatmap tells us:

  1. Close Proximity Patterns: The clusters of darker colors (blues and purples) suggest that certain bus stops are located very close to each other. This could be indicative of densely populated areas or regions where stops are closely spaced.

  2. Distance Variation: There are sporadic patches of lighter colors (yellows and greens), indicating longer distances between certain bus stops. This variation can signal areas where service might be less frequent or accessible, potentially indicating regions that are underserved and might benefit from additional stops or enhanced services.

  3. Diagonal Line of Darkness: The diagonal from the top left to the bottom right, which is uniformly dark, represents the distance from each bus stop to itself, which is naturally zero. This is a standard feature in distance matrices.

  4. Identifying Potential Service Gaps: The areas with lighter squares between darker clusters could be key locations for adding new bus stops or improving transit connectivity. These gaps in service might represent physical barriers (like rivers or highways) or simply areas that have been overlooked in transit planning.

  5. Strategic Planning: Planners can use this information to optimize routes, considering where to place new stops or how to reroute existing lines to reduce overall travel times and improve service coverage in underserved areas.

Overall, this heatmap provides a visual tool for identifying how evenly distributed and accessible bus transportation services are across a region, highlighting both well-served areas and potential gaps in service.


Based on the map visualization showing bus stop distribution with red markers for underserved stops and purple markers for other stops, and considering the violet lines representing distances greater than 7 kilometers between bus stops, we can deduce several important pieces of information about current bus stop distribution and potential locations for new stops:

  1. High Concentration Areas: The central area, particularly around the Melbourne CBD, has a high concentration of bus stops, both underserved (red) and otherwise (purple). This suggests a well-serviced urban core which might not require additional stops but could benefit from other types of service improvements such as increased frequency or extended hours.

  2. Long Distance Between Stops: The violet lines, indicating distances greater than 7 kilometers between certain stops, show significant gaps in service coverage. These lines, especially those stretching out to areas with fewer or no stops, highlight regions where new stops could dramatically improve service.

  3. Identifying Underserved Regions: The red markers signify underserved bus stops, potentially due to their isolation from other stops or because they serve areas with insufficient coverage. Placing new stops between these red-marked underserved stops and the nearest purple stops could enhance connectivity.

  4. Spatial Distribution of Service Gaps: The distribution of violet lines shows not just linear distances but also the geographic spread of areas lacking adequate service. The southern and eastern peripheries of the map, for instance, show longer distances between stops, suggesting that these are likely areas where public transit accessibility could be improved.

  5. Strategic Placement for New Stops: The areas without any markers or with sparse coverage by purple markers are potential zones for developing new bus stops. Strategic placement in these zones can help bridge the long gaps shown by the violet lines, ensuring a more evenly distributed transit network that can cater to a larger population and reduce transit deserts.

  6. Optimization Opportunities: Analyzing the overlap of long-distance lines and existing bus stops can offer insights into optimizing current routes. Reconfiguring some routes might address several gaps without necessarily adding many new stops, especially if existing stops are underutilized or inefficiently positioned.

This map is a valuable tool for transit planners aiming to enhance bus service coverage, efficiency, and accessibility in Melbourne. It illustrates not only where services are currently lacking but also how existing resources might be better utilized or expanded to meet the city’s transportation needs more effectively.
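Selecting the pairs of stops to connect with lines comes down to filtering the distance matrix by a threshold. The sketch below shows one way to do that; `pairs_by_distance` is a hypothetical helper, not the notebook's code, and the sample matrix is the top-left 5 × 5 corner of the bus stop distance matrix shown earlier, rounded to two decimals.

```python
from itertools import combinations


def pairs_by_distance(matrix, threshold_km, farther=True):
    """Return (i, j, distance) for each unordered pair beyond (or within) the threshold."""
    pairs = []
    for i, j in combinations(range(len(matrix)), 2):  # each pair once, no diagonal
        d = matrix[i][j]
        keep = d > threshold_km if farther else d < threshold_km
        if keep:
            pairs.append((i, j, d))
    return pairs


# Rounded excerpt of the underserved bus stop distance matrix (km).
sample = [
    [0.00, 1.96, 2.61, 3.65, 6.90],
    [1.96, 0.00, 0.86, 5.36, 5.73],
    [2.61, 0.86, 0.00, 6.15, 6.08],
    [3.65, 5.36, 6.15, 0.00, 8.17],
    [6.90, 5.73, 6.08, 8.17, 0.00],
]

far_pairs = pairs_by_distance(sample, 7.0)                  # candidates for service gaps
near_pairs = pairs_by_distance(sample, 1.0, farther=False)  # closely spaced stops
print(far_pairs, near_pairs)
```

Because the helper indexes with `matrix[i][j]`, it also works on a NumPy array obtained from a dataframe. The same function with `farther=False` selects closely spaced stops instead of distant ones.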


Based on the map visualization showing bus stop distribution, where violet lines connect bus stops that are less than 1 kilometer apart, we can draw several conclusions about the distribution of bus stops and the identification of potential locations for new stops:

  1. Current Bus Stop Clustering: The map displays significant clustering of bus stops in the central areas of Melbourne. These clusters are indicated by dense groups of purple markers (each representing a bus stop), particularly around the Melbourne CBD and adjacent suburbs. This indicates a high concentration of services in these areas.

  2. Distance Threshold Visualization: The violet lines connecting bus stops represent pairs of stops that are less than 1 kilometer apart. This visualization emphasizes areas with a high density of stops where the services are closely spaced.

  3. Identifying Overserviced Areas: In regions where there are many violet lines, indicating numerous stops close to each other, it might suggest that these areas are potentially overserviced. The clustering in such areas might be more than necessary, leading to redundancy in services.

  4. Gaps in Service Coverage: Contrasting the dense areas, there are regions on the map, particularly towards the west and southeast of the city center (towards Port Melbourne and beyond Southbank), where there are fewer purple markers and very few to no violet lines. This indicates areas with lower service coverage, where bus stops are sparse or more dispersed.

  5. Potential for New Bus Stops: The areas lacking violet lines and having sparse purple markers are potential candidates for new bus stops. Establishing new stops in these areas would increase accessibility and service coverage, effectively filling gaps in the existing public transport network.

  6. Strategic Placement for New Stops: To optimize service and coverage, new stops could be strategically placed in underserved areas to bridge large gaps between existing stops. This would not only expand the reach of the bus network but also enhance connectivity between outlying areas and central hubs, facilitating easier and faster commuting options for residents.

  7. Enhancing Transit Connectivity: Adding new stops in the identified gap areas could also help in creating more direct routes or reducing travel times for existing routes, which might currently be circuitous due to the lack of intermediate stops.

Overall, this analysis suggests a focused approach to transit planning where new investments could be targeted in underserved areas, while possibly reevaluating the necessity of closely spaced stops in highly serviced areas. This strategy would lead to a more balanced and efficiently serviced public transportation system across Melbourne.


This map visualizes metro stations in and around Melbourne, with green markers representing regularly served stations and red markers highlighting stations identified as underserved, which received a clustering label of -1. This labeling typically indicates that these stations are outliers or anomalous within the clustering model used, possibly due to their distinct characteristics or locations. Here's what can be inferred from this plot:

  1. Distribution of Underserved Stations: The red markers, signifying underserved metro stations, are predominantly located on the outskirts of the metropolitan area. This spatial pattern suggests that peripheral or less centrally located stations are not as well served as those closer to the city center.

  2. Central Cluster of Well-Served Stations: The dense clustering of green markers around the central and northeastern parts of the city indicates a high concentration of well-served metro stations. These areas likely benefit from more frequent services, better connectivity, and potentially more passenge

  3. Potential for Transit Deserts: Areas without nearby metro stations can become transit deserts, where residents are forced to rely heavily on private vehicles or other less efficient forms of transportation. This reliance can lead to increased traffic congestion, higher transportation costs, and greater environmental impacts due to increased vehicle emissions.

  4. Strategic Development and Investment: For urban development and transportation planning, this map can guide where investments might be most needed to elevate the quality and reach of public transport services. Focusing on underserved stations could also help in promoting more balanced urban growth and reducing congestion in over-served areas.

Overall, this visualization aids in understanding how metro services are distributed across Melbourne, highlighting areas where strategic interventions can improve public transit accessibility and efficiency.
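The -1 label matches the convention scikit-learn's DBSCAN uses for noise points, so the sketch below assumes a DBSCAN-style model. It is a standard-library illustration of the rule that decides when a point counts as noise, not the notebook's actual clustering code. The coordinates, eps_km and min_samples values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def dbscan_noise_flags(points, eps_km, min_samples):
    """Return True for each point a DBSCAN-style model would label -1 (noise)."""
    n = len(points)
    # Neighbourhood of each point: every point within eps_km, itself included.
    neighbours = [
        [j for j in range(n) if haversine_km(points[i], points[j]) <= eps_km]
        for i in range(n)
    ]
    # Core points have at least min_samples points in their neighbourhood.
    is_core = [len(nb) >= min_samples for nb in neighbours]
    # Noise points are non-core points with no core point within eps_km.
    return [
        not is_core[i] and not any(is_core[j] for j in neighbours[i])
        for i in range(n)
    ]


# Hypothetical coordinates: three stations near the CBD, one far to the south-east.
stations = [
    (-37.8183, 144.9671),
    (-37.8102, 144.9628),
    (-37.8120, 144.9730),
    (-38.0800, 145.4900),
]
flags = dbscan_noise_flags(stations, eps_km=2.0, min_samples=3)
print(flags)  # only the distant station is flagged as noise
```

Because the flag depends entirely on eps_km and min_samples, a station marked "underserved" under one setting may join a cluster under another, so the red markers are best read together with the parameters that produced them.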

underserved_distance_df
           0          1          2          3          4          5
0   0.000000   6.929773   1.714985   3.917050   1.874835  40.203942
1   6.929773   0.000000   8.211514   3.656398   5.389960  36.281164
2   1.714985   8.211514   0.000000   5.534630   3.554640  40.016363
3   3.917050   3.656398   5.534630   0.000000   2.069923  39.323995
4   1.874835   5.389960   3.554640   2.069923   0.000000  39.967367
5  40.203942  36.281164  40.016363  39.323995  39.967367   0.000000
plt.figure(figsize=(10, 8))  # Set the figure size for the heatmap
sns.heatmap(underserved_distance_df, annot=True, fmt=".1f", cmap='viridis', linewidths=.5)
plt.title('Underserved Metro stops Distance Matrix Heatmap')
plt.xlabel('Metro Stop Index')
plt.ylabel('Metro Stop Index')
plt.show()

This heatmap displays a distance matrix for underserved metro stops, where each cell represents the distance in kilometers between pairs of metro stops. The distances are color-coded, with different colors representing different ranges of distances:

  1. Zero Distance: The diagonal cells, where the metro stop index matches on both axes (e.g., 0-0, 1-1), show a distance of 0.0 kilometers, naturally indicating the distance from a stop to itself.

  2. Short Distances: Dark purple cells, such as the distance between metro stops 0 and 2 (1.7 km) or between 0 and 4 (1.9 km), represent shorter distances. These stops are relatively close to each other, suggesting that they could potentially serve overlapping areas.

  3. Moderate Distances: Cells shading from purple toward blue, such as the one between stops 1 and 2 (8.2 km), show moderate distances between stops, which could indicate stops that are adequately spaced, providing service to distinct areas without significant overlap.

  4. Long Distances: Bright yellow to yellow-green cells, found in the row and column for metro stop 5 (36.3 to 40.2 km from every other stop), indicate very long distances between these stops. This suggests that these areas are significantly underserved and could benefit from additional stops or better transportation options to bridge the gap between existing stops.

    • Route Optimization: Understanding the distances between stops can aid in optimizing routes for efficiency and coverage. For example, planning new routes or adjusting existing ones could address the underserved areas marked by longer distances on the heatmap.

Overall, the heatmap provides valuable insights into the spatial distribution and connectivity of metro services, especially highlighting critical gaps that need addressing to enhance the effectiveness and reach of public transportation.
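A distance matrix like the one plotted above can be built from station coordinates with the haversine formula. The following is a minimal standard-library sketch, assuming latitude and longitude in degrees and a mean Earth radius of 6371 km. The coordinates are hypothetical and the notebook's own construction of underserved_distance_df may differ.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def distance_matrix(coords):
    """Square matrix whose cell [i][j] is the distance in km from stop i to stop j."""
    return [[haversine_km(a, b) for b in coords] for a in coords]


# Hypothetical stop coordinates (not taken from the dataset).
stops = [
    (-37.8183, 144.9671),
    (-37.8102, 144.9628),
    (-38.0800, 145.4900),
]
matrix = distance_matrix(stops)
for row in matrix:
    print(" ".join(f"{d:8.2f}" for d in row))
```

The result has the same two properties visible in the heatmap: a zero diagonal and symmetry about it, so only the cells above (or below) the diagonal carry independent information.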

The map above shows the metro stations, with lines marking distances greater than 10 kilometers between stations. Several insights can be drawn about the distribution and service of underserved metro stations:

  1. Spatial Isolation of Underserved Stations: The teal lines highlight that some metro stations are more than 10 kilometers apart from their nearest neighbors. These lines predominantly connect stations on the outskirts of the map, especially towards the southeast. This suggests a significant spatial isolation of these stations, indicating they are likely underserved in terms of frequency of service and connectivity.

  2. Potential for Expansion in Coverage: The presence of long distances between certain stations implies that there are large areas within the city that could benefit from either new metro stations or improved service to existing stations. This would help in reducing the transit gaps and enhancing accessibility for residents in these outlying areas.

  3. Focus Areas for Development: The areas where these long-distance lines are concentrated, particularly the corridors extending southeast, highlight specific regions where transit infrastructure development could be prioritized. Developing these areas could involve adding new routes, increasing the frequency of existing services, or constructing new stations to bridge the large gaps.

  4. Equity in Transit Services: Ensuring that these underserved and isolated stations receive improved services could also address issues of equity in public transportation access. Residents in these areas may currently face longer travel times and less frequent service, impacting their mobility and access to city resources.

Overall, this map provides a clear visual tool for identifying gaps in metro service coverage and can help guide future investments in public transportation infrastructure to achieve a more balanced and effective transit network across the metropolitan area.
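One way to check the isolation reading directly is to compute each station's distance to its closest other station and compare it with the 10 km threshold. The sketch below is a standard-library illustration under that assumption. The station names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def nearest_neighbour_km(stations):
    """Map each station name to the distance (km) to its closest other station."""
    nearest = {}
    for name, pos in stations.items():
        nearest[name] = min(
            haversine_km(pos, other_pos)
            for other, other_pos in stations.items()
            if other != name
        )
    return nearest


# Hypothetical stations: two central, one on the south-eastern fringe.
stations = {
    "Central A": (-37.8183, 144.9671),
    "Central B": (-37.8102, 144.9628),
    "Outer C": (-38.0800, 145.4900),
}
nearest = nearest_neighbour_km(stations)
isolated = [name for name, d in nearest.items() if d > 10.0]
print(isolated)
```

A line longer than 10 km between two stations does not by itself show isolation, since either end may still have a closer neighbour elsewhere. The nearest-neighbour distance separates the two cases.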

The map helps visualize the distribution of metro stations across a geographic area, highlighting the proximity of stations to each other based on a specified distance threshold (in this case, 5 kilometers). Here's what can be understood and how optimization might be approached:

  1. Distance Highlighting: The code uses teal lines to connect metro stations that are less than 5 kilometers apart. This visualization helps identify clusters of metro stations that are densely packed, as well as possibly indicating areas where stations might be redundantly close to each other.

  2. Station Distribution: The distribution of metro stations, as shown by the blue circle markers, provides a visual representation of how well different areas are served. The clustering of lines in certain areas suggests good local connectivity, whereas areas with fewer or no lines might be underserved.

  3. Areas of Potential Over-service and Under-service: Dense clusters of teal lines might suggest areas where stations could be considered excessive or too closely spaced. Conversely, large areas without teal lines or stations indicate regions that might be underserved and could benefit from the addition of new stations or the extension of existing lines.

Optimization Strategies:

  4. Reviewing Station Spacing: The areas with dense teal lines can be reviewed to determine if some stations are unnecessarily close, which might be inefficient in terms of operational costs and resource allocation. Removing or consolidating some stations might improve the overall efficiency of the network.

  5. Expanding Coverage: For regions on the map devoid of stations or connectivity (teal lines), strategic placement of new metro stations can help expand the reach of public transport, making it more accessible to wider populations. This can help in reducing transit deserts and improving connectivity for suburban and rural areas.

Overall, the map and code provide a foundational tool for analyzing metro station distribution and identifying key areas for potential optimization to enhance service efficiency, coverage, and passenger satisfaction.
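The 5 km linking rule described above can be sketched without any mapping library: list every station pair closer than the threshold, then count how many links touch each station. This is a hedged standard-library illustration with hypothetical station names and coordinates. On the real map, each returned pair would correspond to one teal line.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def close_pairs(stations, threshold_km):
    """Return the name pairs of stations that lie closer than threshold_km."""
    return [
        (name_a, name_b)
        for (name_a, pos_a), (name_b, pos_b) in combinations(stations.items(), 2)
        if haversine_km(pos_a, pos_b) < threshold_km
    ]


# Hypothetical stations: three central, one on the outskirts.
stations = {
    "Central A": (-37.8183, 144.9671),
    "Central B": (-37.8102, 144.9628),
    "Central C": (-37.8120, 144.9730),
    "Outer D": (-38.0800, 145.4900),
}
pairs = close_pairs(stations, threshold_km=5.0)

# Count the links per station; zero means no line would touch that station.
link_counts = {name: 0 for name in stations}
for name_a, name_b in pairs:
    link_counts[name_a] += 1
    link_counts[name_b] += 1
print(link_counts)
```

High link counts point to the densely spaced stations discussed under station spacing, while a count of zero marks a station with no neighbour inside the threshold.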

Conclusion

Concluding the analysis of metro and bus transportation systems in the City of Melbourne, several key insights and actionable recommendations have been developed to optimize public transit and meet the demands of a growing urban population:

  1. Strategic Expansion and Optimization: The project has identified critical gaps in transit coverage, particularly in peripheral and rapidly growing suburban areas. To address these gaps, the strategic placement of new metro and bus stations is recommended. Additionally, the optimization of existing routes and the potential consolidation of underutilized stations could significantly enhance service efficiency and reduce operational costs.

  2. Integration of Multimodal Transportation: A significant finding of this project is the need for better integration between different modes of public transport. Creating multimodal transportation hubs that connect metro, bus, and other transportation modes can significantly improve the usability of the public transit system, making it more convenient for users to travel across different parts of the city and beyond.

  3. Data-Driven Planning: Utilizing advanced data analytics and geographical information systems (GIS) has proven invaluable in understanding patterns of service use and area demographics. Continuing to leverage data will be crucial in making informed decisions that respond dynamically to future changes in population and urban development.

  4. Community Engagement and Feedback: Engaging with local communities to gather feedback on public transportation has highlighted the need for more user-focused services. Continued engagement through surveys, public forums, and feedback channels should be a priority to ensure that the transit system evolves in alignment with user needs and expectations.

In conclusion, this project provides a comprehensive roadmap for enhancing Melbourne's public transit system. By focusing on strategic development, integration, and sustainability, Melbourne can ensure that its transportation network meets current demands and is prepared for future challenges, ultimately making the city more livable and accessible for all its residents.
